

Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off

Neural Information Processing Systems

Deploying AI-powered systems requires trustworthy models supporting effective human interactions, going beyond raw prediction accuracy. Concept bottleneck models promote trustworthiness by conditioning classification tasks on an intermediate level of human-like concepts. This enables human interventions which can correct mispredicted concepts to improve the model's performance. However, existing concept bottleneck models are unable to find optimal compromises between high task accuracy, robust concept-based explanations, and effective interventions on concepts---particularly in real-world conditions where complete and accurate concept supervisions are scarce. To address this, we propose Concept Embedding Models, a novel family of concept bottleneck models which goes beyond the current accuracy-vs-interpretability trade-off by learning interpretable high-dimensional concept representations. Our experiments demonstrate that Concept Embedding Models (1) attain better or competitive task accuracy w.r.t.




Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off

#artificialintelligence

Concept Bottleneck Models (CBMs, [1]) have, since their inception in 2020, become a significant achievement in explainable AI. These models attempt to address the lack of human trust in AI by encouraging a deep neural network to be interpretable by design.

To this aim, CBMs first learn a mapping between their inputs (e.g., images of cats and dogs) and a set of "concepts" corresponding to high-level units of information that humans commonly use to describe what they see (e.g., "whiskers", "long tail", "black fur", etc.). This mapping function, which we will call the "concept encoder", is learnt by -- you guessed it:) -- a differentiable and highly expressive model such as a deep neural network!

It is through this concept encoder that CBMs then solve downstream tasks of interest (e.g., classifying an image as a dog or a cat) by mapping concepts to output labels. The label predictor used to map concepts to task labels can be your favourite differentiable model, although in practice it tends to be something simple, such as a single fully connected layer.
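To make the encoder/predictor split concrete, here is a minimal, illustrative sketch of the CBM pipeline. All names and shapes are assumptions for demonstration: the "concept encoder" is reduced to a single linear layer plus a sigmoid (a real CBM would use a trained deep network), the label predictor is a single fully connected layer as described above, and the weights are random rather than learned.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ConceptBottleneckModel:
    """Toy CBM sketch: input -> concept probabilities -> task label.

    Hypothetical, untrained model for illustration only.
    """

    def __init__(self, n_features, n_concepts, n_classes):
        # Concept encoder weights (stand-in for a deep network).
        self.W_enc = rng.normal(size=(n_features, n_concepts))
        # Label predictor: a single fully connected layer over concepts.
        self.W_dec = rng.normal(size=(n_concepts, n_classes))

    def predict_concepts(self, x):
        # Each output is a probability that a concept (e.g. "whiskers")
        # is present in the input.
        return sigmoid(x @ self.W_enc)

    def predict(self, x, interventions=None):
        c = self.predict_concepts(x)
        if interventions:
            # A human expert overrides mispredicted concepts; the
            # correction flows through to the downstream label.
            for idx, value in interventions.items():
                c[..., idx] = value
        logits = c @ self.W_dec
        return c, logits.argmax(axis=-1)

model = ConceptBottleneckModel(n_features=8, n_concepts=3, n_classes=2)
x = rng.normal(size=(1, 8))

concepts, label = model.predict(x)
# Intervention: an expert asserts concept 0 (say, "whiskers") is present.
concepts_fixed, label_fixed = model.predict(x, interventions={0: 1.0})
```

Because the label predictor only sees the concept layer, fixing a single concept value is all it takes to influence the final prediction; this bottleneck is exactly what makes test-time interventions possible.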